I.  EXECUTIVE  SUMMARY

In  this  work,  an  analytical  expression  is  developed  for  the  differential  entropy  of  a  sinusoid.  The  mutual  information 
between  two  sinusoids  with  different  amplitudes  and  phase  is  then  derived  using  both  continuous  and  discrete  entropies. 


II.  INTRODUCTION 

One  of  the  fundamental  questions  in  information  theory  is  the  maximum  level  of  data  compression.  Shannon  [1] 
provided  the  answer  to  this  question  by  demonstrating  that  random  processes  have  an  irreducible  complexity  beyond 
which  no  compression  is  possible.  This  quantity  is  called  the  entropy  of  the  random  process,  and  this  concept  is 
currently  of  great  interest  in  communication  theory  studies. 

The concept of entropy is discussed in several books (see e.g. [2, 3]). Usually its formulation is first given for discrete random variables; that is, if $X$ is a discrete random variable with values $\{x_1, x_2, \ldots, x_m, \ldots\}$ and probability distribution $p(X)$ defined by $p(x_m) = \mathrm{Prob}\{X = x_m\}$, the entropy is expressed as

$$H(X) = -\sum_{m} p(x_m) \log[p(x_m)].$$

This  log  is  usually  taken  to  be  log2  and  then  the  entropy  is  given  in  units  of  bits.  If  the  log  is  taken  in  the  base  e, 
then  the  entropy  is  written  as  He(X)  and  is  given  in  units  of  “nats”.  This  (discrete)  entropy  has  many  interesting 
properties,  one  of  which  is  that  it  is  non-negative;  it  provides  a  way  to  quantify  the  information  content  of  a  probability 
distribution. 
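As a small illustration of the definition above, the following sketch evaluates the discrete entropy of a probability list in bits or nats. The function name `discrete_entropy` and the fair-die example are illustrative choices, not part of the original text.

```python
import math

def discrete_entropy(probs, base=2.0):
    """Return H = -sum p*log(p) for a list of probabilities (hypothetical helper)."""
    total = sum(probs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p = 0 contribute nothing, since p*log(p) -> 0 as p -> 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0) / math.log(base)

fair_die = [1.0 / 6.0] * 6
h_bits = discrete_entropy(fair_die)            # entropy in bits (base 2)
h_nats = discrete_entropy(fair_die, math.e)    # entropy in nats (base e)
```

For a uniform distribution over six outcomes the two results are log2(6) bits and ln(6) nats, which differ only by the factor ln(2).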

This  concept  is  extended  to  the  case  of  a  continuous  random  variable,  and  is  then  called  the  differential  entropy. 
For  a  continuous  random  variable  X  with  probability  density  function  p(x) ,  the  differential  entropy  is  given  by: 

$$h(X) = -\int_{S} p(x) \log[p(x)]\,dx$$

where $S = \{x \mid p(x) > 0\}$ is the support set of $X$. Again, log usually means $\log_2$. If the log is taken to the base $e$, the notation $h_e(X)$ is utilized. Many properties of (discrete) entropy carry over to differential entropy; however, the differential entropy may take on negative as well as positive values. The differences between discrete and continuous entropy are discussed in [4] where, in fact, an alternative measure of information content is proposed.
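A short worked example may help show how a differential entropy can be negative. It uses a uniform density on $[0, a]$, which is an illustrative choice and not a case treated in the original text.

```latex
p(x) = \frac{1}{a}, \quad 0 \le x \le a
\qquad\Longrightarrow\qquad
h(X) = -\int_{0}^{a} \frac{1}{a}\,\log_2\!\left[\frac{1}{a}\right] dx = \log_2[a].
% a = 4   gives h(X) =  2 bits
% a = 1   gives h(X) =  0 bits
% a = 1/2 gives h(X) = -1 bit
```

The sign of $h(X)$ therefore depends only on whether the width $a$ is above or below 1, in contrast to the discrete entropy, which is never negative.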

In  [2]  tables  of  differential  entropies  are  given  for  various  probability  distributions.  However,  in  signal  detection 
research  the  prototypical  case  is  the  detection  of  a  sinusoidal  signal  in  background  noise.  For  example,  suppose  the 
signal  is  written  as: 


$$y = A \sin(\theta),$$

where $\theta$ is uniformly distributed on $[-\pi, \pi]$. (A similar discussion applies if $y = A\sin(\omega\theta)$ or $y = A\sin(\theta + \phi)$, where $\omega$ and $\phi$ are constants.) Restricting $\theta$ to the interval $[-\pi/2, \pi/2]$, on which $y$ monotonically increases through its range of values, permits the use of the transformation of variables technique to calculate the probability density function of $y$:

$$p(y) = \frac{1/\pi}{\left|\dfrac{dy}{d\theta}\right|} = \frac{1}{\pi A \cos(\theta)} = \frac{1}{\pi\sqrt{A^2 - y^2}}, \qquad -A < y < A, \tag{1}$$

as depicted in Figure 1. This probability distribution has zero mean with variance $\sigma^2 = A^2/2$.

FIG. 1: Probability Density Function of a Sine Wave
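The density in Eqn. (1) and the stated moments can be checked numerically. The sketch below is only a sanity check under the stated model (theta uniform on [-pi, pi]); the helper names `sine_pdf` and `sine_moments`, the amplitude A = 2, and the grid sizes are assumptions made for illustration.

```python
import math

def sine_pdf(y, A):
    """Density of y = A*sin(theta) from Eqn. (1); valid for -A < y < A."""
    return 1.0 / (math.pi * math.sqrt(A * A - y * y))

def sine_moments(A, n=100_000):
    """Mean and variance of A*sin(theta) by the midpoint rule over one period."""
    h = 2.0 * math.pi / n
    ys = [A * math.sin(-math.pi + (k + 0.5) * h) for k in range(n)]
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return mean, var

A = 2.0
mean, var = sine_moments(A)

# Probability that 0 < y < A/2, obtained by integrating the density with the
# midpoint rule; the closed form is (1/pi)*asin(1/2) = 1/6.
m = 100_000
step = (A / 2.0) / m
prob = sum(sine_pdf((k + 0.5) * step, A) for k in range(m)) * step
```

The computed mean and variance can be compared with 0 and A^2/2, and the integrated probability with the arcsine expression that follows from Eqn. (1).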

The  sine  wave  differential  entropy  is  important  for  signal  detection  applications  and  does  not  appear  in  the  literature. 
Its  computation  proceeds  as  follows: 


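$$\begin{aligned}
h_e(y) &= -\int_{-A}^{A} p(y)\ln[p(y)]\,dy \\
&= -\int_{-A}^{A} \frac{1}{\pi\sqrt{A^2 - y^2}}\,\ln\!\left[\frac{1}{\pi\sqrt{A^2 - y^2}}\right] dy \\
&= \frac{2}{\pi}\int_{0}^{A} \frac{1}{\sqrt{A^2 - y^2}}\,\ln\!\left[\pi\sqrt{A^2 - y^2}\right] dy.
\end{aligned} \tag{2}$$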


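Making the change of variables $w = \dfrac{\sqrt{A^2 - y^2}}{A}$, $dw = -\dfrac{y}{A\sqrt{A^2 - y^2}}\,dy$, and $y = A\sqrt{1 - w^2}$ leads to

$$\begin{aligned}
h_e(y) &= \frac{2}{\pi}\int_{0}^{1} \ln[\pi A w]\,\frac{dw}{\sqrt{1 - w^2}} \\
&= \frac{2}{\pi}\int_{0}^{1} \frac{\ln(w)}{\sqrt{1 - w^2}}\,dw + \frac{2}{\pi}\ln[\pi A]\int_{0}^{1} \frac{dw}{\sqrt{1 - w^2}} \\
&= \frac{2}{\pi}\left(-\frac{\pi}{2}\ln[2]\right) + \frac{2}{\pi}\ln[\pi A]\,\sin^{-1}(w)\Big|_{0}^{1}
\end{aligned} \tag{3}$$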


where  the  first  integral  is  given  in  [5]  (formula  4.241(7)).  The  differential  entropy  for  a  sine  wave  is  therefore 

$$h_e(y) = -\ln[2] + \ln[\pi A] = \ln\!\left[\frac{\pi A}{2}\right] \tag{4}$$


or  in  bits 


$$h(y) = \log_2[e]\,h_e(y) = \frac{1}{\ln[2]}\,\ln\!\left[\frac{\pi A}{2}\right] = \log_2\!\left[\frac{\pi A}{2}\right]. \tag{5}$$


Note that h(y) can be positive or negative, depending on the value of A: it is negative for A < 2/π and positive for A > 2/π.
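The closed form in Eqns. (4) and (5) can be cross-checked by direct numerical integration. The sketch below assumes the substitution y = A sin(theta), which turns the entropy integral into an average of ln(pi A cos(theta)) over (-pi/2, pi/2); the function names and the grid size are illustrative and not taken from the original text.

```python
import math

def sine_entropy_nats_numeric(A, n=200_000):
    """Midpoint-rule estimate of h_e(y) for y = A*sin(theta).

    With y = A*sin(theta), p(y) dy = d(theta)/pi on (-pi/2, pi/2) and
    -ln p(y) = ln(pi*A*cos(theta)), so h_e is the average of that logarithm.
    The midpoint rule never evaluates the logarithm at the endpoints.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2.0 + (k + 0.5) * h
        total += math.log(math.pi * A * math.cos(theta))
    return total * h / math.pi

def sine_entropy_bits(A):
    """Closed form of Eqn. (5): log2(pi*A/2)."""
    return math.log2(math.pi * A / 2.0)

numeric_nats = sine_entropy_nats_numeric(0.99)
closed_nats = math.log(math.pi * 0.99 / 2.0)   # Eqn. (4)
```

Evaluating `sine_entropy_bits` for amplitudes on either side of 2/pi shows the change of sign noted above.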

Due to the ubiquity of analog-to-digital converters for transforming physical measurements into digital format, it is instructive to also examine the discrete entropy of the sine wave. For this purpose we construct a new discrete variable $y_Q$ as follows:

The random variable $y = A\sin(\theta)$ ranges between $-A$ and $A$, so we cover the interval $[-A, A]$ with $2^N + 1$ bins, each of width $Q = 2A/2^N = A/2^{N-1}$. One bin is centered about 0, then there are $\frac{1}{2}(2^N)$ bins for positive values with the last one centered about $+A$ and $\frac{1}{2}(2^N)$ bins for negative values with the last one centered about $-A$. Then we define $y_Q$ to be the value at the center of each bin; that is:

$$y_Q = iQ \quad \text{for} \quad iQ - Q/2 < y < iQ + Q/2 \tag{6}$$


for $i = -2^{N-1}, \ldots, -2, -1, 0, 1, 2, \ldots, 2^{N-1}$. We now define the probability distribution for $y_Q$ so that the probability of each value of $y_Q$ is the same as the integral of the probability density function of the continuous random variable $y$ over that bin. Let $P_i = \mathrm{Prob}\{y_Q = iQ\}$. Then for $|i| < 2^{N-1}$,

$$\begin{aligned}
P_i &= \int_{iQ - Q/2}^{iQ + Q/2} \frac{1}{\pi\sqrt{A^2 - y^2}}\,dy = \frac{1}{\pi}\sin^{-1}(y/A)\Big|_{iQ - Q/2}^{iQ + Q/2} \\
&= \frac{1}{\pi}\left[\sin^{-1}\!\left(\frac{i + 1/2}{2^{N-1}}\right) - \sin^{-1}\!\left(\frac{i - 1/2}{2^{N-1}}\right)\right].
\end{aligned} \tag{7}$$

At the extreme bins, i.e. for $|i| = 2^{N-1}$,

$$\begin{aligned}
P_i &= \int_{A - Q/2}^{A} \frac{1}{\pi\sqrt{A^2 - y^2}}\,dy = \frac{1}{\pi}\sin^{-1}(y/A)\Big|_{A - Q/2}^{A} \\
&= \frac{1}{\pi}\left[\frac{\pi}{2} - \sin^{-1}\!\left(1 - 2^{-N}\right)\right].
\end{aligned} \tag{8}$$

Note the symmetry; that is, for $i \neq 0$, $P_i = P_{-i}$. Now the (discrete) entropy for the discrete random variable $y_Q$ is given by

$$H(y_Q) = -\sum_{i=-2^{N-1}}^{2^{N-1}} P_i \log_2[P_i] = -\left[P_0 \log_2[P_0] + 2\sum_{i=1}^{2^{N-1}} P_i \log_2[P_i]\right]. \tag{9}$$

To provide an example of this formula, calculations were made with the parameters of a typical analog-to-digital (A/D) converter. The device is an $N$-bit digitizer with $2^N$ bins over the full range from $-1$ to $+1$, so the bin width $Q$ is $1/2^{N-1}$. In this case, $A = 0.99$ represents the A/D converter full scale in order to avoid truncation at the extreme values. Table I shows the results calculated from Eqn. (9) for various values of $N$.

But the relationship between the discrete entropy $H(y_Q)$ and the differential entropy $h(y)$ is given in [2] as:

$$H(y_Q) + \log_2[Q] \to h(y) \quad \text{as} \quad Q \to 0.$$
TABLE I: Theoretical and Estimated Discrete Entropies for Varying Discretizations

 N     Discrete Entropy H(y_Q)     (N - 1) + h(y)
 2     1.96                        1.637
 4     3.86                        3.637
 6     5.81                        5.637
 8     7.73                        7.637
 10    9.66                        9.637
 12    11.65                       11.637
 14    13.65                       13.637
 16    15.64                       15.637

TABLE II: Convergence of the discrete entropy to the differential entropy plus N - 1


In our case, $Q = 1/2^{N-1}$, so

$$H(y_Q) \to (N - 1) + h(y) \quad \text{as} \quad N \to \infty. \tag{10}$$

From Eqn. (5), when $A = 0.99$, $h(y) = 0.637$. The right-hand column of Table I gives for comparison the values $(N - 1) + 0.637$, and the expected convergence is clearly shown. For a typical A/D converter, $N$ would range from 8 to 16.
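The limit in Eqn. (10) can be explored with a short script. This is a sketch under stated assumptions rather than the procedure used to produce Table I: bins are taken to be centered at iQ for i from -2^(N-1) to 2^(N-1) as in Eqn. (6), with Q = 1/2^(N-1), and any part of a bin outside [-A, A] is clipped. The function name `quantized_sine_entropy` is hypothetical, and the script does not attempt to reproduce the tabulated entries.

```python
import math

def quantized_sine_entropy(A, Q, n_half):
    """Entropy in bits of y = A*sin(theta) quantized to bin centers i*Q.

    Bins are [i*Q - Q/2, i*Q + Q/2] for i = -n_half..n_half, clipped to
    [-A, A]; each bin probability is the integral of 1/(pi*sqrt(A^2 - y^2)).
    Returns the entropy and the sum of the bin probabilities.
    """
    def cdf(y):
        y = max(-A, min(A, y))                  # clip to the support of y
        return 0.5 + math.asin(y / A) / math.pi

    entropy = 0.0
    total = 0.0
    for i in range(-n_half, n_half + 1):
        p = cdf(i * Q + Q / 2.0) - cdf(i * Q - Q / 2.0)
        total += p
        if p > 0.0:                              # empty bins contribute nothing
            entropy -= p * math.log2(p)
    return entropy, total

A = 0.99
h_bits = math.log2(math.pi * A / 2.0)            # Eqn. (5)
N = 16
Q = 1.0 / 2 ** (N - 1)
H, total = quantized_sine_entropy(A, Q, 2 ** (N - 1))
gap = H + math.log2(Q) - h_bits                  # distance from the limit in Eqn. (10)
```

Running the same computation for smaller N gives a feel for how slowly the gap closes; the inverse-square-root singularities of the density at the ends of the range are the main source of the residual.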

In this paper we wish to calculate the mutual information between two sinusoids. Again, specific examples of the computation of mutual information for continuous probability distributions do not appear in textbooks. Even for the Gaussian case, [2] does not specifically express the mutual information between two Gaussian distributions, although all the necessary terms are given. The following formulas are all taken from [2] for normal variables $X \in N(\mu_x, \sigma_x)$, $Y \in N(\mu_y, \sigma_y)$:

$$\begin{aligned}
h(X) &= \tfrac{1}{2}\log_2\!\left[2\pi e\,\sigma_x^2\right] \\
h(Y) &= \tfrac{1}{2}\log_2\!\left[2\pi e\,\sigma_y^2\right] \\
h(X, Y) &= \tfrac{1}{2}\log_2\!\left[(2\pi e)^2 |K|\right] \\
&= \tfrac{1}{2}\log_2\!\left[(2\pi e)^2\left(\sigma_x^2\sigma_y^2 - \left(E[(X - \mu_x)(Y - \mu_y)]\right)^2\right)\right]
\end{aligned} \tag{11}$$

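where $|K|$ is the determinant of the covariance matrix $K$. Therefore, we can express the mutual information between $X$ and $Y$ as

$$\begin{aligned}
I(X, Y) &= h(X) + h(Y) - h(X, Y) \\
&= \tfrac{1}{2}\log_2\!\left[\sigma_x^2\sigma_y^2\right] - \tfrac{1}{2}\log_2\!\left[\sigma_x^2\sigma_y^2 - \left(E[(X - \mu_x)(Y - \mu_y)]\right)^2\right] \\
&= \tfrac{1}{2}\log_2\!\left[\sigma_x^2\sigma_y^2\right] - \tfrac{1}{2}\log_2\!\left[(1 - \rho^2)\,\sigma_x^2\sigma_y^2\right] \\
&= -\tfrac{1}{2}\log_2\!\left[1 - \rho^2\right]
\end{aligned} \tag{12}$$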

where  p  is  the  correlation  coefficient.  This  formula  has  been  given  explicitly  so  that  we  can  draw  attention  to  the  two 
extreme  special  cases: 

Case 1: X and Y are independent

$$\rho = 0, \qquad I(X;Y) = 0$$

Case 2: X and Y are linearly related so X and Y completely determine one another; that is, for any given X there corresponds one value of Y (so X gives complete information about Y) and vice-versa

$$\rho = \pm 1, \qquad I(X;Y) = +\infty$$
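The behavior of Eqn. (12) between these two extremes is easy to tabulate. The following is a minimal sketch; the function name `gaussian_mutual_information_bits` and the sample correlation values are illustrative assumptions.

```python
import math

def gaussian_mutual_information_bits(rho):
    """Eqn. (12): I = -(1/2)*log2(1 - rho^2) for jointly normal X and Y."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    if abs(rho) == 1.0:
        return math.inf            # Case 2: linearly related variables
    return -0.5 * math.log2(1.0 - rho * rho)

# Mutual information for a few correlation coefficients (illustrative values).
samples = {rho: gaussian_mutual_information_bits(rho)
           for rho in (0.0, 0.5, 0.9, 0.99, 1.0)}
```

The values grow without bound as the correlation coefficient approaches plus or minus one, which is the limit described in Case 2.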
Now consider the case of interest here, where

$$X = A\sin(\theta), \qquad Y = B\sin(\theta + \phi)$$

are two sinusoids with $\theta$ uniformly distributed on $[-\pi, \pi]$ and $\phi$ a fixed phase angle between 0 and $2\pi$. In order to compute the mutual information between $X$ and $Y$, we first note that a fixed value of $X$ corresponds to two values of $Y$ as depicted in Figure (2). Let $w$ be a specific value of $X$. If $0 < w < A$ then $\theta_1 = \sin^{-1}(w/A)$ and $\theta_2 = \pi - \sin^{-1}(w/A)$ and $Y$ is equal to either $B\sin(\theta_1 + \phi)$ or $B\sin(\theta_2 + \phi)$. If $-A < w < 0$, then $\theta_1' = \sin^{-1}(w/A)$ and $\theta_2' = -\pi - \sin^{-1}(w/A)$ and $Y$ is equal to either $B\sin(\theta_1' + \phi)$ or $B\sin(\theta_2' + \phi)$. Hence the conditional probability distribution $p(Y|X = w)$ is a discrete distribution with only two possible values for the discrete conditional random variable. Since the mutual information of two random variables is the difference between an entropy and a conditional entropy, it can be computed in either of two ways; that is, in terms of differential entropies:

$$I(X, Y) = h(Y) - h(Y|X)$$

or in terms of discretized versions:

$$I(X, Y) = H(Y) - H(Y|X)$$

But the differential entropy of a discrete random variable must be $-\infty$, for the volume of the support set in $n$-dimensional space that contains most of the probability of the variable is $2^{nh}$. Since for a discrete random variable this volume is clearly zero, the entropy $h$ must be $-\infty$. Consequently, $h(Y|X) = -\infty$ and $I(X, Y) = +\infty$. Compare this result to the case of the two linearly-related Gaussians that also gave the mutual information to be $+\infty$. In the Gaussian case, fixing $X$ determines a single value of $Y$. In the sinusoidal case, fixing $X$ determines two values of $Y$, but reducing the possible values of $Y$ from an infinite number to just two numbers still provides an "infinite amount of information" about $Y$.

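So we have the result that we sought, but it is still informative to examine the discretized calculation of the mutual information. For this, the (discrete) entropy $H(Y|X)$ is to be computed for the discrete probability distribution $p(Y|X)$. For a fixed value $w$ of $X$ with $0 < w < A$ (the case for $-A < w < 0$ yields the very same formulas), the variable $Y$ has values

$$\begin{aligned}
y' &= B\sin\!\left(\sin^{-1}\!\left(\tfrac{w}{A}\right) + \phi\right) \\
&= B\left[\tfrac{w}{A}\cos\phi + \cos\!\left(\sin^{-1}\!\left(\tfrac{w}{A}\right)\right)\sin\phi\right] \\
&= B\left[\tfrac{w}{A}\cos\phi + \sqrt{1 - \left(\tfrac{w}{A}\right)^2}\,\sin\phi\right]
\end{aligned}$$

or

$$\begin{aligned}
y'' &= B\sin\!\left(\pi - \sin^{-1}\!\left(\tfrac{w}{A}\right) + \phi\right) \\
&= B\left[\tfrac{w}{A}\cos\phi + \cos\!\left(\pi - \sin^{-1}\!\left(\tfrac{w}{A}\right)\right)\sin\phi\right] \\
&= B\left[\tfrac{w}{A}\cos\phi - \sqrt{1 - \left(\tfrac{w}{A}\right)^2}\,\sin\phi\right].
\end{aligned}$$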


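Now from the probability density function for $Y$, $p(Y)$, we have

$$p(y') = \frac{1}{\pi\sqrt{B^2 - (y')^2}} = \frac{1}{\pi B\sqrt{\cos^2\phi - 2\,\tfrac{w}{A}\sqrt{1 - \left(\tfrac{w}{A}\right)^2}\,\sin\phi\cos\phi - \left(\tfrac{w}{A}\right)^2\cos 2\phi}}$$

and likewise

$$p(y'') = \frac{1}{\pi\sqrt{B^2 - (y'')^2}} = \frac{1}{\pi B\sqrt{\cos^2\phi + 2\,\tfrac{w}{A}\sqrt{1 - \left(\tfrac{w}{A}\right)^2}\,\sin\phi\cos\phi - \left(\tfrac{w}{A}\right)^2\cos 2\phi}}$$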


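Normalization then gives the conditional distribution $P(Y|X = w)$ for each $w$ to be a discrete distribution with just two values of the variable, with probabilities $p_w = \dfrac{D_1}{D_1 + D_2}$ and $1 - p_w$, where

$$D_1 = \frac{1}{\sqrt{\cos^2\phi - 2\,\tfrac{w}{A}\sqrt{1 - \left(\tfrac{w}{A}\right)^2}\,\sin\phi\cos\phi - \left(\tfrac{w}{A}\right)^2\cos 2\phi}}$$

and

$$D_2 = \frac{1}{\sqrt{\cos^2\phi + 2\,\tfrac{w}{A}\sqrt{1 - \left(\tfrac{w}{A}\right)^2}\,\sin\phi\cos\phi - \left(\tfrac{w}{A}\right)^2\cos 2\phi}}.$$

For each $w$, the discrete entropy is then given by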


$$H(p_w) = -p_w\log_2(p_w) - (1 - p_w)\log_2(1 - p_w)$$

and the conditional entropy becomes

$$H(Y|X) = \int_{-A}^{A} \frac{1}{\pi\sqrt{A^2 - w^2}}\,H(p_w)\,dw.$$

Let us call this number $H'''$. Since $0 \le H(p_w) \le 1$, it is clear that $0 \le H''' \le 1$.

Consequently, in terms of discretized entropies, where the bins covering the range of $Y$ are of width $Q = 1/2^{N-1}$,

$$\begin{aligned}
I(X, Y) &= H(Y_Q) - H''' \\
&\to (N - 1) + \log_2\!\left(\frac{\pi B}{2}\right) - H''' \\
&\to +\infty \quad \text{as} \quad N \to \infty
\end{aligned}$$

since $\log_2\!\left(\frac{\pi B}{2}\right)$ and $H'''$ are finite.
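The number H''' can be estimated numerically by following the normalization described in the text. The sketch below is illustrative only: the function names are hypothetical, the substitution w = A sin(t) (which turns the weight dw/(pi sqrt(A^2 - w^2)) into dt/pi and removes A from the computation) is an added convenience, and the midpoint rule with n points is an assumed discretization.

```python
import math

def binary_entropy_bits(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def conditional_entropy_bits(B, phi, n=20_000):
    """Midpoint-rule estimate of H''' using the text's two-point distribution.

    alpha plays the role of asin(w/A); integrating over alpha with weight
    1/pi is equivalent to integrating over w with weight
    1/(pi*sqrt(A^2 - w^2)).
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        alpha = -math.pi / 2.0 + (k + 0.5) * h
        y1 = B * math.sin(alpha + phi)              # first value of Y
        y2 = B * math.sin(math.pi - alpha + phi)    # second value of Y
        d1 = math.sqrt(max(B * B - y1 * y1, 0.0))   # p(y1) is proportional to 1/d1
        d2 = math.sqrt(max(B * B - y2 * y2, 0.0))   # p(y2) is proportional to 1/d2
        # p_w = p(y1) / (p(y1) + p(y2)) = d2 / (d1 + d2)
        p_w = 0.5 if d1 + d2 == 0.0 else d2 / (d1 + d2)
        total += binary_entropy_bits(p_w)
    return total * h / math.pi

h_quadrature = conditional_entropy_bits(1.0, math.pi / 2.0)  # phase of pi/2
h_generic = conditional_entropy_bits(1.0, 0.3)               # an arbitrary phase
```

For a phase of pi/2 the two candidate values of Y are equal in magnitude, so the two probabilities coincide and the estimate sits at the upper bound of 1 bit; for other phases it falls inside the interval from 0 to 1, consistent with the bound stated above.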


III.  CONCLUSIONS 

This  paper  calculates  the  differential  entropy  for  a  sinusoid  and  compares  it  to  its  discrete  version  for  various 
discretizations.  The  mutual  information  between  two  sinusoids  differing  in  phase  is  shown  to  be  $+\infty$.